Table-driven look-ahead lexical analysis
Author
Abstract
Modern programming languages use regular expressions to define valid tokens. Traditional lexical analyzers based on minimal deterministic finite automata for those regular expressions cannot handle the look-ahead problem on their own: the scanner writer must explicitly identify the look-ahead states and code the buffering and re-scanning operations by hand. We identify the class of finite look-ahead finite automata, which is general enough to include all finite automata of practical lexical analyzers. Finite look-ahead finite automata are then transformed into suffix finite automata, which a new lexical analyzer uses to identify tokens. The new lexical analyzer solves the look-ahead problem with a table-driven approach and can detect lexical errors earlier than traditional lexical analyzers. The extra cost is a larger state transition table and three additional one-dimensional tables.
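To make the look-ahead problem concrete, the following is a minimal sketch of the traditional approach that the abstract contrasts itself with: a DFA-driven scanner that applies the longest-match rule by remembering the last accepting position and re-scanning from there. It is not the suffix-automaton method of the paper, whose construction the abstract does not describe. The token set (INT, REAL, DOTDOT), the state numbering, and the names `TRANSITIONS`, `ACCEPT`, `char_class` and `scan` are all assumptions chosen for illustration.

```python
# Hypothetical DFA for three token kinds: INT (digits), REAL (digits '.' digits)
# and DOTDOT (".."). State numbers are illustrative only.
TRANSITIONS = {
    (0, "digit"): 1,  # start -> INT
    (0, "dot"): 4,    # start -> seen "."
    (1, "digit"): 1,  # INT -> INT
    (1, "dot"): 2,    # INT -> seen "digits." (look-ahead state, not accepting)
    (2, "digit"): 3,  # "digits." -> REAL
    (3, "digit"): 3,  # REAL -> REAL
    (4, "dot"): 5,    # "." -> DOTDOT
}
ACCEPT = {1: "INT", 3: "REAL", 5: "DOTDOT"}


def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isdigit():
        return "digit"
    if ch == ".":
        return "dot"
    return None


def scan(text):
    """Longest-match scanning with explicit backtracking to the last accept."""
    tokens = []
    pos = 0
    while pos < len(text):
        state = 0
        last_accept = None  # (token kind, end position) of the longest match so far
        i = pos
        while i < len(text):
            nxt = TRANSITIONS.get((state, char_class(text[i])))
            if nxt is None:
                break
            state = nxt
            i += 1
            if state in ACCEPT:
                last_accept = (ACCEPT[state], i)
        if last_accept is None:
            raise ValueError(f"lexical error at position {pos}")
        kind, end = last_accept
        tokens.append((kind, text[pos:end]))
        pos = end  # characters read past `end` are re-scanned on the next pass
    return tokens
```

On the input `1..10`, this scanner reads `1.` before discovering that no REAL follows, falls back to the INT `1`, and then scans the first `.` a second time as part of `..`. That buffering and re-scanning is the hand-coded work the abstract refers to.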
Similar resources
Mealy Machines are a Better Model of Lexical Analyzers
Lexical analyzers partition input characters into tokens. When ambiguities arise during lexical analysis, the longest-match rule is generally adopted to resolve them. The longest-match rule causes the look-ahead problem in traditional lexical analyzers, which are based on Moore machines. In Moore machines, output tokens are associated with states of the automata. By contrast...
Look-ahead Levinson- and Schur-type Recurrences in the Padé Table
For computing Padé approximants, we present presumably stable recursive algorithms that follow two adjacent rows of the Padé table and generalize the well-known classical Levinson and Schur recurrences to the case of a nonnormal Padé table. Singular blocks in the table are crossed by look-ahead steps. Ill-conditioned Padé approximants are also skipped. If the size of these look-ahead steps is bo...
Look-ahead techniques for fast beam search
In this paper, we present two efficient look-ahead pruning techniques in beam search for large vocabulary continuous speech recognition. Both techniques, the language model look-ahead and the phoneme look-ahead, are incorporated into the word-conditioned search algorithm using a bigram language model and a lexical prefix tree [5]. The paper presents the following novel contributions: We describe...
Reducing time-synchronous beam search effort using stage based look-ahead and language model rank based pruning
In this paper, we present an efficient look-ahead technique based on both the Language Model (LM) Look-Ahead and the Acoustic Model (AM) Look-Ahead for time-synchronous beam search in large vocabulary speech recognition. In this so-called stage based look-ahead (SLA) technique, two predicting processes with different hypothesis evaluating criteria are organized by stages according to the...
Improved lexical tree search for large vocabulary speech recognition
This paper describes some extensions to the language model (LM) look-ahead pruning approach, which is integrated into the time-synchronous beam search algorithm. The search algorithm is based on a lexical prefix tree in combination with a word-conditioned dynamic search space organization for handling trigram language models in a one-pass strategy. In particular, we study several LM look-ahead pr...